NSF PAR Search | NSF Public Access Repository

Computational Benefits of Intermediate Rewards for Goal-Reaching Policy Learning

https://doi.org/10.1613/jair.1.13326

Zhai, Yuexiang; Baek, Christina; Zhou, Zhengyuan; Jiao, Jiantao; Ma, Yi (January 2022, Journal of Artificial Intelligence Research)

Empirical results on many goal-reaching reinforcement learning (RL) tasks have shown that rewarding the agent on subgoals improves convergence speed and practical performance. We attempt to provide a theoretical framework to quantify the computational benefits of rewarding the completion of subgoals, in terms of the number of synchronous value iterations. In particular, we consider subgoals as one-way intermediate states, which can only be visited once per episode, and propose two settings that consider these one-way intermediate states: the one-way single-path (OWSP) and the one-way multi-path (OWMP) settings. In both OWSP and OWMP settings, we demonstrate that adding intermediate rewards to subgoals is more computationally efficient than only rewarding the agent once it completes the goal of reaching a terminal state. We also reveal a trade-off between computational complexity and the pursuit of the shortest path in the OWMP setting: adding intermediate rewards significantly reduces the computational complexity of reaching the goal, but the agent may not find the shortest path, whereas with sparse terminal rewards, the agent finds the shortest path at a significantly higher computational cost. We also corroborate our theoretical results with extensive experiments on the MiniGrid environments using Q-learning and some popular deep RL algorithms.
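The effect described in the abstract can be illustrated on a toy problem. The sketch below is not the paper's OWSP construction; it assumes a hypothetical corridor with states 0..n, left/right actions, a reward of 1 for stepping into any rewarded state, and rewarded states that block leftward moves (so each can be entered only once). The function name `iterations_until_goal` and all parameters are made up for illustration. It counts how many synchronous value-iteration sweeps are needed before the greedy policy walks from state 0 to the goal.

```python
def iterations_until_goal(n, rewarded, gamma=0.9, max_iters=1000):
    """Count synchronous value-iteration sweeps until the greedy policy
    moves right in every non-terminal state of a corridor 0..n.

    `rewarded` is the set of states that pay reward 1 on entry; it should
    contain the goal n. Rewarded states and state 0 block leftward moves.
    """
    one_way = set(rewarded) | {0}

    def step(s, a):
        # a = +1 moves right, a = -1 moves left (or stays if blocked).
        if a == 1:
            nxt = s + 1
            return nxt, (1.0 if nxt in rewarded else 0.0)
        nxt = s if s in one_way else s - 1
        return nxt, 0.0

    def q(v, s, a):
        nxt, r = step(s, a)
        return r + gamma * v[nxt]

    v = [0.0] * (n + 1)  # v[n] stays 0 because the goal is terminal
    for k in range(max_iters + 1):
        # Greedy policy w.r.t. v, with ties broken toward "left":
        # the agent reaches the goal only if "right" is strictly better everywhere.
        if all(q(v, s, 1) > q(v, s, -1) for s in range(n)):
            return k
        # One synchronous sweep: every state is updated from the old v.
        v = [max(q(v, s, 1), q(v, s, -1)) for s in range(n)] + [0.0]
    raise RuntimeError("greedy policy did not reach the goal")


if __name__ == "__main__":
    print(iterations_until_goal(12, {12}))           # sparse terminal reward: 11
    print(iterations_until_goal(12, {3, 6, 9, 12}))  # subgoal rewards: 2
```

In this toy setting the sweep count is governed by the longest distance to the next rewarded state, which is why adding subgoal rewards shortens it.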
Full Text Available
Convolutional Normalization: Improving Deep Convolutional Network Robustness and Training

Liu, Sheng; Li, Xiao; Zhai, Yuexiang; You, Chong; Zhu, Zhihui; Fernandez-Granda, Carlos; Qu, Qing (January 2021, Advances in neural information processing systems)

Normalization techniques have become a basic component in modern convolutional neural networks (ConvNets). In particular, many recent works demonstrate that promoting the orthogonality of the weights helps train deep models and improve robustness. For ConvNets, most existing methods are based on penalizing or normalizing weight matrices derived from concatenating or flattening the convolutional kernels. These methods often destroy or ignore the benign convolutional structure of the kernels; therefore, they are often expensive or impractical for deep ConvNets. In contrast, we introduce a simple and efficient "Convolutional Normalization" (ConvNorm) method that can fully exploit the convolutional structure in the Fourier domain and serve as a simple plug-and-play module to be conveniently incorporated into any ConvNet. Our method is inspired by recent work on preconditioning methods for convolutional sparse coding and can effectively promote each layer's channel-wise isometry. Furthermore, we show that our ConvNorm can reduce the layerwise spectral norm of the weight matrices and hence improve the Lipschitzness of the network, leading to easier training and improved robustness for deep ConvNets. Applied to classification under noise corruptions and to generative adversarial networks (GANs), we show that ConvNorm improves the robustness of common ConvNets such as ResNet and the performance of GANs. We verify our findings via numerical experiments on CIFAR and ImageNet. Our implementation is available online at https://github.com/shengliu66/ConvNorm.
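The Fourier-domain idea mentioned in the abstract can be sketched in its simplest form. This is not the authors' ConvNorm implementation (see their repository for that); it assumes a single-channel, 1-D circular convolution, and the helper names `dft`, `circular_conv`, `whiten_kernel`, and `norm` are hypothetical. For a circular convolution, the singular values of the operator are the magnitudes of the kernel's DFT, so dividing the spectrum by its magnitude yields a kernel whose convolution preserves the norm of its input.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * math.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]
    return [v / n for v in out] if inverse else out


def circular_conv(a, x):
    """Circular convolution of two equal-length sequences."""
    n = len(x)
    return [sum(a[j] * x[(i - j) % n] for j in range(n)) for i in range(n)]


def whiten_kernel(a, eps=1e-12):
    """Flatten the kernel's spectrum to unit magnitude, keeping its phase.

    Assumes the kernel's DFT has no (near-)zero entries.
    """
    flat = [s / max(abs(s), eps) for s in dft(a)]
    # A real kernel has a conjugate-symmetric spectrum, so the result is real.
    return [v.real for v in dft(flat, inverse=True)]


def norm(x):
    return math.sqrt(sum(abs(v) ** 2 for v in x))


if __name__ == "__main__":
    a = [1.0, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]  # kernel zero-padded to length 8
    x = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]
    w = whiten_kernel(a)
    print(max(abs(s) for s in dft(a)))  # spectral norm of the original operator
    print(max(abs(s) for s in dft(w)))  # spectral norm after whitening
    print(norm(x), norm(circular_conv(w, x)))  # the two norms agree
```

The multi-channel case treated in the paper involves more than this per-frequency scalar division; the sketch only shows why working in the Fourier domain makes the spectral norm easy to control.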
Full Text Available
Complete Dictionary Learning via L4-Norm Maximization over the Orthogonal Group

Zhai, Yuexiang; Yang, Zitong; Liao, Zhenyu; Wright, John; Ma, Yi (January 2020, Journal of machine learning research)
Full Text Available
